Corpus: oci_wikipedia_2021_10K


4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of distinct word N-grams (N = 1 to 5) at sentence endings, counted over the first K sentences

      K   # of words   # of bigrams   # of trigrams   # of 4-grams   # of 5-grams
    100           93             95              99             99             99
   1000          836            955             992            997            999
  10000         6511           8867            9660           9892           9935
 100000         6511           8867            9661           9893           9936
1000000         6511           8867            9661           9893           9936
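
A table of this kind could be produced roughly as follows. This is a minimal sketch, not the method used to build the corpus statistics: the function name ending_ngram_counts, whitespace tokenization, the treatment of punctuation as ordinary tokens, and the skipping of sentences shorter than N are all assumptions made for illustration. When K exceeds the number of available sentences, the sketch reports the totals for the whole input, which would match the way the counts above stop growing for large K.

```python
def ending_ngram_counts(sentences, ks=(100, 1000, 10000), max_n=5):
    """Return {K: [number of distinct ending n-grams for n = 1..max_n]}.

    Assumptions (not taken from the source): tokens are split on
    whitespace, and a sentence with fewer than n tokens contributes
    no n-gram for that n.
    """
    seen = [set() for _ in range(max_n)]  # seen[n-1] holds ending n-grams
    targets = set(ks)
    result = {}
    count = 0
    for count, sentence in enumerate(sentences, start=1):
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                # The last n tokens form the sentence-ending n-gram.
                seen[n - 1].add(tuple(tokens[-n:]))
        if count in targets:
            result[count] = [len(s) for s in seen]
    # For any K larger than the input, report the final totals.
    for k in ks:
        if k > count:
            result[k] = [len(s) for s in seen]
    return result


if __name__ == "__main__":
    demo = ["the cat sat", "a dog sat", "the cat sat", "it ran"]
    for k, counts in sorted(ending_ngram_counts(demo, ks=(2, 4, 100), max_n=3).items()):
        print(k, counts)
```

Keeping one set per N and taking snapshots at each K means the input is read only once, regardless of how many values of K are requested.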


Zipf's diagram for sentence endings

[Figure: Gnuplot diagram]
